chore: upgrade Spark to 3.4, and Deequ to 2.0.5 #168
The tests fail because the new version of Deequ introduced some new optional parameters (e.g. the analyzerOptions). Today the Python land cannot leverage the default parameters in Scala land, so it throws an error. If the interface has to be an exact match, the code will bifurcate, since older versions of Deequ won't have this parameter. For example, the change below fixed the issue for Deequ 2.0.5 but broke Deequ < 2.0.5:
- self._Check = self._Check.hasMaxLength(column, assertion_func, hint)
+ analyzer_options = self._jvm.scala.Option.apply(None)
+ self._Check = self._Check.hasMaxLength(column, assertion_func, hint, analyzer_options)
Test failures:
Edit: it seems hard to call Scala defaults even from Java... we might have to define multiple methods in Scala land without those default arguments.
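A minimal sketch of one possible short-term workaround on the Python side: choose the argument list according to the Deequ version, so both old and new signatures can be called from one code path. Everything named here (`parse_version`, `has_max_length`, `ANALYZER_OPTIONS_SINCE`, and the two stub classes) is hypothetical and not part of PyDeequ, and the 2.0.5 cutoff is an assumption taken from the comment above. The stubs only stand in for the JVM Check object so the example runs without Spark.

```python
import re

# Assumption taken from the comment above: the extra analyzerOptions argument
# is required from Deequ 2.0.5 onward. Adjust if the real cutoff differs.
ANALYZER_OPTIONS_SINCE = (2, 0, 5)


def parse_version(version):
    """Extract (major, minor, patch) from a string such as '2.0.5-spark-3.4'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_max_length(check, column, assertion_func, hint, deequ_version, empty_option):
    """Hypothetical helper: call hasMaxLength with the arity the JVM side expects."""
    if parse_version(deequ_version) >= ANALYZER_OPTIONS_SINCE:
        # Newer signature: the optional argument must be passed explicitly,
        # because Scala default values are not applied for calls from Python.
        return check.hasMaxLength(column, assertion_func, hint, empty_option)
    # Older signature: the fourth parameter does not exist.
    return check.hasMaxLength(column, assertion_func, hint)


# Stand-ins for the two JVM Check signatures so the sketch runs without Spark.
class OldCheck:
    def hasMaxLength(self, column, assertion, hint):
        return ("3 args", column)


class NewCheck:
    def hasMaxLength(self, column, assertion, hint, analyzer_options):
        return ("4 args", column, analyzer_options)


old_result = has_max_length(OldCheck(), "name", None, None, "2.0.3-spark-3.3", None)
new_result = has_max_length(NewCheck(), "name", None, None, "2.0.5-spark-3.4", None)
```

In PyDeequ itself, `empty_option` would presumably be the value built with `self._jvm.scala.Option.apply(None)`, as in the diff above. Version branching of this kind keeps older releases working but has to be repeated for every method that gained a parameter, which is the bifurcation the comment warns about.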
We have been busy with re:Invent - expect some progress in Dec.
Hi, is there a potential date for supporting Spark 3.4? :) Is it more likely December/January, or even later?
Hello, @chenliu0831! Is there an expected date for supporting this Spark version? Or maybe 3.5?
Hi, is there any update on this, please?
Hi all, I created a new pull request to accommodate Spark 3.4 and Deequ versions later than 2.0.3. You are welcome to take a look.
@anqini thanks so much for looking into this and submitting the PR. Unfortunately, we cannot drop support for older Spark/Deequ versions yet. I will take a closer look in #178. All - we have discussed this with the Deequ team and we will be working on a longer-term solution, including a support plan for older Spark versions. There are no ETAs yet (some plan in Jan), but the good news is that we merged the maintainer groups from both repos. This weekend I will look into whether we can have a safe short-term solution in PyDeequ only.
Hi, @chenliu0831! Any news on this upgrade?
Hello all! Any news on the version upgrade?
Hi. Is there any update on this, please?
Closing this for now; see my comments in #192 (comment), and we can provide updates there.
Issue #, if available: #151
Description of changes:
Upgrade Spark to 3.4 and Deequ to 2.0.5
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.